Penalized cluster analysis with applications to family data

نویسندگان

  • Yixin Fang
  • Junhui Wang
چکیده

Cluster analysis is the assignment of observations into clusters so that observations in the same cluster are similar in some sense, and many clustering methods have been developed. However, these methods cannot be applied to family data, which possess intrinsic familial structure. To take the familial structure into account, we propose a form of penalized cluster analysis with a tuning parameter controlling its influence. The tuning parameter can be selected based on the concept of clustering stability. The method can also be applied to other cluster data such as panel data. The method is illustrated via simulations and an application to a family study of asthma.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Penalized Estimators in Cox Regression Model

The proportional hazard Cox regression models play a key role in analyzing censored survival data. We use penalized methods in high dimensional scenarios to achieve more efficient models. This article reviews the penalized Cox regression for some frequently used penalty functions. Analysis of medical data namely ”mgus2” confirms the penalized Cox regression performs better than the cox regressi...

متن کامل

Non-hierarchical Clustering with Rival Penalized Competitive Learning for Information Retrieval

In large content-based image database applications, e cient information retrieval depends heavily on good indexing structures of the extracted features. While indexing techniques for text retrieval are well understood, e cient and robust indexing methodology for image retrieval is still in its infancy. In this paper, we present a non-hierarchical clustering scheme for index generation using the...

متن کامل

Network-based clustering with mixtures of L1-penalized Gaussian graphical models: an empirical investigation

In many applications, multivariate samples may harbor previously unrecognized heterogeneity at the level of conditional independence or network structure. For example, in cancer biology, disease subtypes may differ with respect to subtype-specific interplay between molecular components. Then, both subtype discovery and estimation of subtype-specific networks present important and related challe...

متن کامل

Rival Penalization Controlled Competitive Learning for Data Clustering with Unknown Cluster Number

Conventional clustering algorithms such as k-means (Forgy 1965, MacQueen 1967) need to know the exact cluster number k∗ before performing data clustering. Otherwise, they will lead to a poor clustering performance. Unfortunately, it is often hard to determine k∗ in advance in many practical problems. Under the circumstances, Xu et al. in 1993 proposed an approach named Rival Penalized Competiti...

متن کامل

Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data

Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Statistics & Data Analysis

دوره 55  شماره 

صفحات  -

تاریخ انتشار 2011